perf: eager-load params when streaming on cpu backend by fszontagh · Pull Request #1687 · leejet/stable-diffusion.cpp

fszontagh · 2026-06-21T15:50:38Z

Summary

After #1644, model weights are loaded from disk lazily on the first prepare_params call. For apps that pre-load the model (long-lived servers, batched generation), this means the actual disk I/O happens during the first sampling step rather than during the explicit model-load call. With --stream-layers on a large CPU-backed model, that first step pays for the full multi-segment disk read while the host expects "the model is loaded, generation should be fast now."

This PR adds an --eager-load flag (off by default). When set, all registered params are loaded into the params backend right after metadata validation, so subsequent prepare_params calls fast-path and the I/O cost is paid at model-load time instead of during sampling. No behavior change for users who don't pass the flag.
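A minimal C++ sketch of the lazy versus eager distinction, under stated assumptions: `ParamStore`, `load_all`, `simulate`, and the tensor names are hypothetical and are not the stable-diffusion.cpp API. Only the name `prepare_params` and the `--eager-load` flag come from this PR, and disk reads are simulated with a counter rather than real I/O.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for a parameter store; NOT the stable-diffusion.cpp API.
class ParamStore {
public:
    explicit ParamStore(std::vector<std::string> names) : names_(std::move(names)) {}

    // Lazy path: read a tensor only if it is not resident yet.
    // Returns true when the call had to go to (simulated) disk.
    bool prepare_params(const std::string& name) {
        if (loaded_.count(name) != 0) {
            return false;  // fast path: already resident
        }
        loaded_.emplace(name, read_from_disk(name));
        return true;
    }

    // Eager path: load every registered param up front, at model-load time.
    void load_all() {
        for (const std::string& name : names_) {
            prepare_params(name);
        }
    }

    const std::vector<std::string>& names() const { return names_; }
    std::size_t disk_reads() const { return disk_reads_; }

private:
    // Simulated I/O: counts reads instead of touching a real file.
    std::vector<float> read_from_disk(const std::string& /*name*/) {
        ++disk_reads_;
        return std::vector<float>(4, 0.0f);
    }

    std::vector<std::string> names_;
    std::unordered_map<std::string, std::vector<float>> loaded_;
    std::size_t disk_reads_ = 0;
};

// Number of simulated disk reads paid in each phase.
struct PhaseReads {
    std::size_t at_load;
    std::size_t at_first_step;
};

// Model load followed by a first sampling step that touches every param.
PhaseReads simulate(bool eager_load) {
    std::vector<std::string> names = {"layer0.weight", "layer1.weight", "layer2.weight"};
    ParamStore store(names);

    if (eager_load) {
        store.load_all();  // opt-in: pay the I/O during model load
    }
    const std::size_t at_load = store.disk_reads();

    for (const std::string& name : store.names()) {
        store.prepare_params(name);  // first sampling step
    }
    const std::size_t at_first_step = store.disk_reads() - at_load;

    // The total work is the same in both modes; only the phase differs.
    assert(at_load + at_first_step == store.names().size());
    return {at_load, at_first_step};
}
```

In this sketch, `simulate(false)` pays all three reads during the first step, while `simulate(true)` pays them during load and the first step hits only the fast path.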

Numbers

RTX 3060 12 GB, --offload-to-cpu --stream-layers --max-vram -1:

| Workload | Default (lazy) | `--eager-load` |
| --- | --- | --- |
| Z-Image bf16 1024x688 batch=2 9 steps generate_image | 244 s | 63 s |
| Qwen Image Edit Q8 1024x688 20 steps generate_image | 40+ min stall | 68 s |

The disk-read work isn't avoided, only relocated from "first sampling step" to "model load." For interactive / long-lived apps this is the user-visible win.

Checklist

I have read and confirmed this PR follows the contribution guidelines.

leejet · 2026-06-21T16:39:13Z

I think it would be better to control this with a flag, such as --eager-load, and leave it disabled by default.

fszontagh · 2026-06-21T16:57:43Z

Done in 5f817c0. Switched to a public --eager-load flag, defaults to false. The lazy path stays the upstream default; users opt in when they want the I/O paid at model-load time.


leejet · 2026-06-22T14:10:21Z

Thank you for your contribution.

perf: --eager-load to pre-load params at model-load time

5f817c0

fszontagh force-pushed the perf/auto-eager-streaming branch from df69b60 to 5f817c0 Compare June 21, 2026 16:57

Merge remote-tracking branch 'upstream/master' into perf/auto-eager-streaming

b357892

leejet merged commit 787d229 into leejet:master Jun 22, 2026

leejet merged 2 commits into leejet:master from fszontagh:perf/auto-eager-streaming
